This report explores data from the Human Development Report (UNDP). The data set covers 194 countries from 1990 until 2015. The selected variables are the HDI (Human Development Index), its 3 components (the Education, Life Expectancy and Income Indices), and the male and female versions of the HDI.

More information can be found here: http://dev-hdr.pantheonsite.io/sites/default/files/hdr2016_technical_notes_0.pdf

and the data sources here: http://hdr.undp.org/en/data and https://unstats.un.org/unsd/methodology/m49/overview/

The data for this report was prepared using the file Prep Data.

1 Univariate Plots Section

1.1 General

First look at the data: 4518 observations in an unbalanced panel (not every variable is available for every country/year pair).

## 'data.frame':    4518 obs. of  10 variables:
##  $ Country                       : Factor w/ 194 levels "Afghanistan",..: 1 2 3 7 8 9 10 13 14 15 ...
##  $ year                          : int  1990 1990 1990 1990 1990 1990 1990 1990 1990 1990 ...
##  $ Human Development Index       : num  0.295 0.635 0.577 0.705 0.634 0.866 0.794 0.745 0.386 0.714 ...
##  $ Life Expectancy Index         : num  0.459 0.797 0.719 0.793 0.737 0.874 0.853 0.806 0.591 0.791 ...
##  $ Education Index               : num  0.122 0.569 0.385 0.628 0.634 0.873 0.676 0.574 0.252 0.625 ...
##  $ Income Index                  : num  0.458 0.565 0.694 0.705 0.545 0.85 0.867 0.895 0.386 0.738 ...
##  $ Human Development Index Male  : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ Human Development Index Female: num  NA NA NA NA NA NA NA NA NA NA ...
##  $ Region                        : Factor w/ 5 levels "Africa","Americas",..: 3 4 1 2 3 5 4 3 3 2 ...
##  $ Development Classification    : Factor w/ 2 levels "Developed","Developing": 2 1 2 2 2 1 1 2 2 2 ...
##         Country          year      Human Development Index
##  Afghanistan:  26   Min.   :1990   Min.   :0.1940         
##  Albania    :  26   1st Qu.:1997   1st Qu.:0.5140         
##  Algeria    :  26   Median :2004   Median :0.6745         
##  Argentina  :  26   Mean   :2003   Mean   :0.6478         
##  Armenia    :  26   3rd Qu.:2010   3rd Qu.:0.7800         
##  Australia  :  26   Max.   :2015   Max.   :0.9490         
##  (Other)    :4362                  NA's   :170            
##  Life Expectancy Index Education Index   Income Index   
##  Min.   :0.177         Min.   :0.0810   Min.   :0.0870  
##  1st Qu.:0.659         1st Qu.:0.4570   1st Qu.:0.5132  
##  Median :0.783         Median :0.6260   Median :0.6760  
##  Mean   :0.751         Mean   :0.5959   Mean   :0.6603  
##  3rd Qu.:0.855         3rd Qu.:0.7270   3rd Qu.:0.8060  
##  Max.   :0.987         Max.   :0.9390   Max.   :1.0000  
##  NA's   :2618          NA's   :2748     NA's   :2592    
##  Human Development Index Male Human Development Index Female
##  Min.   :0.304                Min.   :0.118                 
##  1st Qu.:0.587                1st Qu.:0.527                 
##  Median :0.727                Median :0.695                 
##  Mean   :0.705                Mean   :0.661                 
##  3rd Qu.:0.820                3rd Qu.:0.805                 
##  Max.   :0.951                Max.   :0.944                 
##  NA's   :3277                 NA's   :3279                  
##       Region     Development Classification
##  Africa  :1249   Developed :1144           
##  Americas: 823   Developing:3374           
##  Asia    :1163                             
##  Europe  :1014                             
##  Oceania : 269                             
##                                            
## 

1.2 Data balancing

First let’s look at how balanced the data is in terms of Countries, Development Classification and Region:

For Countries a table of available entries is more informative:

## # A tibble: 20 × 2
##                Country     n
##                 <fctr> <int>
## 1          Afghanistan    26
## 2              Albania    26
## 3              Algeria    26
## 4              Andorra    10
## 5               Angola    19
## 6  Antigua and Barbuda    14
## 7            Argentina    26
## 8              Armenia    26
## 9            Australia    26
## 10             Austria    26
## 11          Azerbaijan    22
## 12             Bahamas    18
## 13             Bahrain    26
## 14          Bangladesh    26
## 15            Barbados    26
## 16             Belarus    22
## 17             Belgium    26
## 18              Belize    26
## 19               Benin    26
## 20              Bhutan    10

What is the average count of entries per country?

##        n        
##  Min.   :10.00  
##  1st Qu.:22.00  
##  Median :26.00  
##  Mean   :23.29  
##  3rd Qu.:26.00  
##  Max.   :26.00

I see that most countries have 26 entries in the data set (yearly from 1990 until 2015)
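The per-country counts above can be reproduced along the following lines. This is a minimal base R sketch on a made-up three-country panel; the data frame `toy` and its values are illustrative assumptions, not the report's `HDR` data or the code that produced the tables above.

```r
# Toy unbalanced panel: country A has 3 years, B has 2, C has 1 (made-up values)
toy <- data.frame(
  Country = c("A", "A", "A", "B", "B", "C"),
  year    = c(1990, 1991, 1992, 1990, 1991, 1992)
)

# Number of rows (country/year pairs) per country
n_per_country <- table(toy$Country)
print(n_per_country)

# Distribution of those counts, analogous to the summary shown above
print(summary(as.vector(n_per_country)))

# A panel is balanced only if every country has the same number of entries
is_balanced <- length(unique(as.vector(n_per_country))) == 1
print(is_balanced)
```

In the report's data the counts range from 10 to 26, so the same check would flag the panel as unbalanced.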

Let’s see the data distribution in terms of Development Classification and Region:

About 25% of countries are classified as Developed.

Let’s also see which data is available in each year:

HDI is available for most countries in each year.

Life Expectancy, Education and Income Indices are available for 1990, 1995, 2000, 2005 and 2010-2015.

HDI male/female is available for 2000, 2005 and 2010-2015.

1.3 Main data

Now let’s explore the main data. First let’s look at the HDI and its 3 components:

I notice a bi-modal distribution in the HDI and in Education. Life Expectancy is mostly concentrated around 0.8-0.9. Income is more spread out, with a peak at 1, which is the cap value.

Univariate Analysis

What is the structure of your dataset?

The data set has 4518 observations of 10 variables. The primary key of this data set is the combination Country + year. There are 194 countries with yearly data from 1990 up to 2015.

The other variables are the Human Development Index and its 3 components (the Life Expectancy, Education and Income Indices), the HDI for males and for females, the Regional classification and the Development classification.

Not all data is available for all countries in all years. HDI is available for most countries in each year.

Life Expectancy, Education and Income Indices are available for 1990, 1995, 2000, 2005 and 2010-2015.

HDI male/female is available for 2000, 2005 and 2010-2015.

What is/are the main feature(s) of interest in your dataset?

I am interested in (1) the global distribution of HDI and (2) how HDI developed over time.

What other features in the dataset do you think will help support your
investigation into your feature(s) of interest?

The components of the HDI are of interest, and how they correlate. The regional and development classifications will also support the analysis

Did you create any new variables from existing variables in the dataset?

I created summaries per year, per year x region and per year x development classification, a subset of selected countries, and one subset for each selected country (which will be used further in the analysis).

The summaries allow plotting over years, grouped by region or development classification. The subset of selected countries is for plotting by country, limited to a few countries to keep visual inspection possible. The individual country subsets allow comparing different variables at country level in the next sections.
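A grouped summary of this kind can be sketched in base R as follows. The data frame `toy`, its column `HDI` and all values are made up for illustration; the report's own summaries may have been built differently.

```r
# Made-up rows: two regions, two years, one missing HDI value
toy <- data.frame(
  year   = c(1990, 1990, 1990, 2015, 2015, 2015),
  Region = c("Africa", "Africa", "Europe", "Africa", "Africa", "Europe"),
  HDI    = c(0.40, NA, 0.75, 0.50, 0.56, 0.85)
)

# Mean HDI per year and region; with the formula interface,
# aggregate() omits rows containing NA before computing the means
by_year_region <- aggregate(HDI ~ year + Region, data = toy, FUN = mean)
print(by_year_region)
```

Each row of `by_year_region` is one point of a time series per region, which is the shape needed for the grouped line plots.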

Of the features you investigated, were there any unusual distributions?
Did you perform any operations on the data to tidy, adjust, or change the form
of the data? If so, why did you do this?

I only created summaries, in order to be able to plot time-series for groups or selected countries. The indices are already normalized and scaled, so I did not expect to need any further adjustments.

2 Bivariate Plots Section

2.1 Distribution within groups

I noticed that HDI and Education had a bi-modal distribution. This could suggest two groups of countries clustered around two peaks.

2.1.1 Distribution within Development Classifications

What would these graphs look like by Development classification?

I see that the Developed countries clearly have higher values, but the bi-modality of the HDI still shows within the Developing countries.

2.1.2 Distribution within Regions

Let’s see what the distributions are by Region:

I notice that, generally speaking, Europe has higher values and Africa lower values.

See averages:

## # A tibble: 5 × 7
##     Region `Human Development Index` `Education Index`
##     <fctr>                     <dbl>             <dbl>
## 1   Africa                 0.4715079         0.4171154
## 2 Americas                 0.6916205         0.6327622
## 3     Asia                 0.6598671         0.5911469
## 4   Europe                 0.8062419         0.7772640
## 5  Oceania                 0.6785859         0.6675204
## # ... with 4 more variables: `Life Expectancy Index` <dbl>, `Income
## #   Index` <dbl>, `Human Development Index Male` <dbl>, `Human Development
## #   Index Female` <dbl>
## # A tibble: 2 × 7
##   `Development Classification` `Human Development Index` `Education Index`
##                         <fctr>                     <dbl>             <dbl>
## 1                    Developed                 0.8151860         0.7873423
## 2                   Developing                 0.5899164         0.5318575
## # ... with 4 more variables: `Life Expectancy Index` <dbl>, `Income
## #   Index` <dbl>, `Human Development Index Male` <dbl>, `Human Development
## #   Index Female` <dbl>

I see a curious spike of the Income Index at 1 in Asia. This seems to be correct, as the index is capped at 1.

2.2 Scatter plots

I now turn to investigate the HDI and its components.

The HDI is an average of its 3 components (strictly their geometric mean, according to the technical notes linked above, rather than the simple average). I plot a reference line that the dots do not fall under, and I also add a best-fit loess curve.
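A quick check on the first data row shown in section 1.1 (Afghanistan, 1990) suggests that the published HDI matches the geometric mean of the three indices, as defined in the technical notes, rather than their simple average. A minimal sketch of that check (the variable names are mine):

```r
# First row of the data shown above (Afghanistan, 1990)
life_exp  <- 0.459
education <- 0.122
income    <- 0.458

# Geometric mean of the three dimension indices
hdi_geometric <- (life_exp * education * income)^(1 / 3)

# Simple (arithmetic) average, for comparison
hdi_arithmetic <- mean(c(life_exp, education, income))

print(round(hdi_geometric, 3))   # 0.295, the HDI reported for that row
print(round(hdi_arithmetic, 3))  # 0.346
```

The geometric mean is never larger than the arithmetic mean, and the gap grows when the components are unequal, as in this row where Education is far below the other two.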

I notice a slightly concave relationship between the HDI and the Life Expectancy Index, and a fairly linear relationship between the HDI and the Education and Income Indices.

Let’s compute the correlations

## 
##  Pearson's product-moment correlation
## 
## data:  HDR$`Human Development Index` and HDR$`Life Expectancy Index`
## t = 92.972, df = 1768, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.9028863 0.9187384
## sample estimates:
##       cor 
## 0.9111488
## 
##  Pearson's product-moment correlation
## 
## data:  HDR$`Human Development Index` and HDR$`Education Index`
## t = 125.85, df = 1768, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.9435662 0.9529423
## sample estimates:
##       cor 
## 0.9484614
## 
##  Pearson's product-moment correlation
## 
## data:  HDR$`Human Development Index` and HDR$`Income Index`
## t = 110.83, df = 1768, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.9288433 0.9405906
## sample estimates:
##       cor 
## 0.9349728
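The output above has the form produced by R's `cor.test`. A minimal sketch of such a call on made-up vectors (the values and the names `hdi` and `income` are hypothetical, not taken from `HDR`):

```r
# Made-up index values for seven hypothetical country/year rows;
# one HDI value is missing
hdi    <- c(0.30, 0.45, NA, 0.58, 0.66, 0.79, 0.90)
income <- c(0.35, 0.48, 0.52, 0.55, 0.70, 0.80, 0.93)

# Pearson correlation test; pairs with an NA in either vector are dropped
result <- cor.test(hdi, income, method = "pearson")

print(result$estimate)   # correlation coefficient
print(result$parameter)  # degrees of freedom = complete pairs - 2
print(result$conf.int)   # 95 percent confidence interval
```

Because the degrees of freedom equal the number of complete pairs minus 2, df = 1768 in the tests above corresponds to 1770 rows in which both variables are available.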

These correlations close to 1 suggest that the components are also highly correlated with each other, since the HDI is an average of its components.

Moreover, since all dots fall clearly above the reference line, this also suggests a strong correlation among the components.

I plot also the scatter plots among the components, and add the x=y reference line.

Let’s compute the correlations:

## 
##  Pearson's product-moment correlation
## 
## data:  HDR$`Education Index` and HDR$`Life Expectancy Index`
## t = 57.277, df = 1768, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.7891524 0.8218312
## sample estimates:
##       cor 
## 0.8061055
## 
##  Pearson's product-moment correlation
## 
## data:  HDR$`Income Index` and HDR$`Education Index`
## t = 58.264, df = 1768, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.7943096 0.8262668
## sample estimates:
##       cor 
## 0.8108919
## 
##  Pearson's product-moment correlation
## 
## data:  HDR$`Life Expectancy Index` and HDR$`Income Index`
## t = 59.074, df = 1884, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.7894348 0.8211260
## sample estimates:
##       cor 
## 0.8058567

Therefore all components seem highly correlated with each other (about 0.81 for each pair), with Income and Education marginally the strongest.

2.3 Evolution over time

It is also interesting to see how the averages evolved over time:

I notice an overall positive trend, without any index overtaking another over time. Reinforcing what I saw before, the Life Expectancy Index is the highest component, and all components seem strongly correlated with each other and with the HDI.

The Income Index seems like the best predictor of the HDI.

2.4 Country Case studies

Finally, it is interesting to see the evolution of the indices in a few countries, as a case study. I look at Brazil and the Netherlands (both my home countries), Japan, Kenya and Australia (one per other region) and Syria (a special case).

I notice a similar overall positive trend here. Syria is an interesting case: due to the war, there is a sharp drop from 2010 onward. Kenya has a negative trend until 2000 and has been improving since.

I also notice a big spread in the values, with the Netherlands on top and Kenya at the bottom.

Bivariate Analysis

Talk about some of the relationships you observed in this part of the
investigation. How did the feature(s) of interest vary with other features in
the data set?

I notice 3 main things:

  1. Typically Europe has high values and Africa low values for the indices.
  2. Over time all indices grow and the rank order stays the same.
  3. All three sub-components are highly correlated with the HDI, and the value of the Income Index is the best predictor of the HDI.

Did you observe any interesting relationships between the other features
(not the main feature(s) of interest)?

I notice that all 3 components are highly correlated with each other (>0.80). HDI and Life Expectancy seem to have a concave relationship, indicating that at lower levels of Life Expectancy, marginal increases are not yet enough to improve the HDI much.

What was the strongest relationship you found?

HDI and Income Index. Even though the correlation coefficient of HDI and the Education Index is slightly stronger, the Income Index average is almost identical to the HDI value.

3 Multivariate Plots Section

3.1 Scatter plots

We repeat the previous scatter plots, but now we add the groupings:

3.1.1 Development Classification

We clearly see that the Developed countries are on the top right corner of all graphs.

3.1.2 Regions

We clearly see, confirming the previous graphs, that Europe is in the top right corner of all graphs, with Africa at the bottom.

3.1.3 Components

We noticed a strong correlation among the 3 components.

To investigate this, we plot Life Expectancy against Education and use Income as the color. We also plot it by region and development classification

These graphs clearly show the strong correlation among all components.

The regional grouping just reinforces what we saw before: high values for Europe, low values for Africa.

3.2 Evolution over time by group

It is also interesting to see how this evolved over time within group.

HDI

Education Index

Life Expectancy Index

Income Index

HDI Male

HDI Female

I notice, as before, an overall positive trend, with neither Regions nor Development Classifications overtaking each other. The one exception seems to be the HDI of Oceania, which drops from 2nd place in 1990 to 4th place in 2015.

Europe ranks consistently at the top, and Africa at the bottom. See the comparison of 1990 with 2015:

## Source: local data frame [5 x 8]
## Groups: year [1]
## 
##    year   Region `Human Development Index` `Education Index`
##   <int>   <fctr>                     <dbl>             <dbl>
## 1  1990   Africa                 0.4227632         0.3046842
## 2  1990 Americas                 0.6255556         0.5181852
## 3  1990     Asia                 0.5880526         0.4560526
## 4  1990   Europe                 0.7509714         0.6468571
## 5  1990  Oceania                 0.6551667         0.6193333
## # ... with 4 more variables: `Life Expectancy Index` <dbl>, `Income
## #   Index` <dbl>, `Human Development Index Male` <dbl>, `Human Development
## #   Index Female` <dbl>
## Source: local data frame [5 x 8]
## Groups: year [1]
## 
##    year   Region `Human Development Index` `Education Index`
##   <int>   <fctr>                     <dbl>             <dbl>
## 1  2015   Africa                 0.5299623         0.4612453
## 2  2015 Americas                 0.7418000         0.6728286
## 3  2015     Asia                 0.7209583         0.6405833
## 4  2015   Europe                 0.8554146         0.8244390
## 5  2015  Oceania                 0.6960909         0.6852727
## # ... with 4 more variables: `Life Expectancy Index` <dbl>, `Income
## #   Index` <dbl>, `Human Development Index Male` <dbl>, `Human Development
## #   Index Female` <dbl>
## Source: local data frame [2 x 8]
## Groups: year [1]
## 
##    year `Development Classification` `Human Development Index`
##   <int>                       <fctr>                     <dbl>
## 1  1990                    Developed                 0.7622750
## 2  1990                   Developing                 0.5290865
## # ... with 5 more variables: `Education Index` <dbl>, `Life Expectancy
## #   Index` <dbl>, `Income Index` <dbl>, `Human Development Index
## #   Male` <dbl>, `Human Development Index Female` <dbl>
## Source: local data frame [2 x 8]
## Groups: year [1]
## 
##    year `Development Classification` `Human Development Index`
##   <int>                       <fctr>                     <dbl>
## 1  2015                    Developed                 0.8623696
## 2  2015                   Developing                 0.6458944
## # ... with 5 more variables: `Education Index` <dbl>, `Life Expectancy
## #   Index` <dbl>, `Income Index` <dbl>, `Human Development Index
## #   Male` <dbl>, `Human Development Index Female` <dbl>

There is an apparent inconsistency: the Oceania HDI is 0.69 in 2015, but HDI male and HDI female are both above 0.84, whereas the HDI should lie between the male and female values. This is explained by the fact that some countries with low HDI have missing values for HDI male and female, so the three averages are taken over different sets of countries.
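The effect of such missing values on group averages can be illustrated with a small made-up example. The data frame `toy` and its values are hypothetical and only mimic the pattern described above; they are not Oceania's actual figures.

```r
# Made-up region with four countries; the two low-HDI countries lack
# the male/female breakdown (NA)
toy <- data.frame(
  hdi        = c(0.90, 0.88, 0.55, 0.50),
  hdi_male   = c(0.91, 0.89, NA, NA),
  hdi_female = c(0.89, 0.87, NA, NA)
)

# Column means with NA removed: each column averages a different set of rows,
# so the overall HDI mean ends up below both the male and the female mean
print(colMeans(toy, na.rm = TRUE))

# Restricting to rows where all three values exist makes the means comparable
complete <- toy[complete.cases(toy), ]
print(colMeans(complete))
```

On the complete rows the HDI mean falls between the female and male means again, which is the expected ordering.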

3.3 Models

We put it all together in a panel model.

First in a random effects model in order to account for the regional grouping.

## Oneway (individual) effect Random Effect Model 
##    (Swamy-Arora's transformation)
## 
## Call:
## plm(formula = HDR$`Human Development Index` ~ HDR$`Life Expectancy Index` + 
##     HDR$`Income Index` + HDR$Region, data = HDR, model = "random", 
##     index = c("Country", "year"))
## 
## Unbalanced Panel: n=81, T=5-26, N=1770
## 
## Effects:
##                     var   std.dev share
## idiosyncratic 1.089e-03 3.300e-02 0.994
## individual    6.325e-06 2.515e-03 0.006
## theta  : 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.01421 0.06078 0.06791 0.06186 0.06791 0.06791 
## 
## Residuals :
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## -0.116753 -0.022287  0.004045  0.000018  0.023260  0.082693 
## 
## Coefficients :
##                                Estimate  Std. Error t-value  Pr(>|t|)    
## (Intercept)                 -0.03735173  0.00551788 -6.7692 1.757e-11 ***
## HDR$`Life Expectancy Index`  0.46617730  0.01175431 39.6601 < 2.2e-16 ***
## HDR$`Income Index`           0.50476534  0.00786496 64.1790 < 2.2e-16 ***
## HDR$RegionAmericas           0.01434739  0.00317149  4.5239 6.477e-06 ***
## HDR$RegionAsia               0.00060373  0.00286004  0.2111    0.8328    
## HDR$RegionEurope             0.03651343  0.00343599 10.6268 < 2.2e-16 ***
## HDR$RegionOceania            0.04389959  0.00425229 10.3237 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    47.077
## Residual Sum of Squares: 2.1636
## R-Squared:      0.95404
## Adj. R-Squared: 0.95027
## F-statistic: 6099.67 on 6 and 1763 DF, p-value: < 2.22e-16

Then in a fixed effect model

## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = HDR$`Human Development Index` ~ HDR$`Life Expectancy Index` + 
##     HDR$`Income Index` + HDR$Region, data = HDR, model = "within", 
##     index = c("Country", "year"))
## 
## Unbalanced Panel: n=81, T=5-26, N=1770
## 
## Residuals :
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -0.1003104 -0.0202519  0.0029314  0.0221676  0.0949724 
## 
## Coefficients :
##                              Estimate Std. Error t-value  Pr(>|t|)    
## HDR$`Life Expectancy Index` 0.4176441  0.0117657 35.4967 < 2.2e-16 ***
## HDR$`Income Index`          0.5130348  0.0075214 68.2097 < 2.2e-16 ***
## HDR$RegionAmericas          0.0231396  0.0031148  7.4288 1.734e-13 ***
## HDR$RegionAsia              0.0059363  0.0028436  2.0876   0.03698 *  
## HDR$RegionEurope            0.0479038  0.0033645 14.2378 < 2.2e-16 ***
## HDR$RegionOceania           0.0519161  0.0041140 12.6194 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    44.057
## Residual Sum of Squares: 1.8325
## R-Squared:      0.95841
## Adj. R-Squared: 0.9113
## F-statistic: 6463.39 on 6 and 1683 DF, p-value: < 2.22e-16

And we select FE based on the Hausman test:

## 
##  Hausman Test
## 
## data:  HDR$`Human Development Index` ~ HDR$`Life Expectancy Index` +  ...
## chisq = 584.13, df = 6, p-value < 2.2e-16
## alternative hypothesis: one model is inconsistent

The Fixed Effect model confirms the strong correlation between components.
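One property of the within (fixed effects) estimator is worth keeping in mind when reading the Region coefficients: it works on deviations from each country's own mean, so a variable that never changes within a country is reduced to zeros. A minimal base R sketch on a made-up panel (the data frame `toy`, the dummy `europe` and the helper `demean` are hypothetical, not part of the report's code):

```r
# Made-up panel: two countries, three years each
toy <- data.frame(
  country = c("A", "A", "A", "B", "B", "B"),
  income  = c(0.40, 0.45, 0.50, 0.80, 0.82, 0.84),
  europe  = c(0, 0, 0, 1, 1, 1)   # region dummy, constant within each country
)

# Within transformation: subtract each country's own mean
demean <- function(x, group) x - ave(x, group)

income_within <- demean(toy$income, toy$country)
europe_within <- demean(toy$europe, toy$country)

print(income_within)  # still varies over time
print(europe_within)  # all zeros: nothing left to estimate
```

With a panel index aligned to the rows, one would therefore expect a within model to drop Region. In the outputs above, Region coefficients are nevertheless reported, and the panel dimensions (n=81, T=5-26) differ from the 194 countries and roughly 10 years of component data described earlier. This may indicate that the `HDR$...` terms in the formula were not aligned with the panel index, so the estimates are best read with some caution.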

However, since the HDI is an average of its components, this result is expected rather than surprising. It is therefore also interesting to run the models based only on the groupings:

## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = HDR$`Human Development Index` ~ HDR$Region, data = HDR, 
##     model = "within", index = c("Country", "year"))
## 
## Unbalanced Panel: n=187, T=10-26, N=4348
## 
## Residuals :
##      Min.   1st Qu.    Median   3rd Qu.      Max. 
## -0.317610 -0.071566 -0.001540  0.062106  0.278657 
## 
## Coefficients :
##                     Estimate Std. Error t-value  Pr(>|t|)    
## HDR$RegionAmericas 0.2234687  0.0051786  43.152 < 2.2e-16 ***
## HDR$RegionAsia     0.1861820  0.0047754  38.988 < 2.2e-16 ***
## HDR$RegionEurope   0.3366026  0.0048258  69.751 < 2.2e-16 ***
## HDR$RegionOceania  0.2082969  0.0081909  25.430 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    114.78
## Residual Sum of Squares: 51.356
## R-Squared:      0.55259
## Adj. R-Squared: 0.52831
## F-statistic: 1283.54 on 4 and 4157 DF, p-value: < 2.22e-16
## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = HDR$`Human Development Index` ~ HDR$`Development Classification`, 
##     data = HDR, model = "within", index = c("Country", "year"))
## 
## Unbalanced Panel: n=187, T=10-26, N=4348
## 
## Residuals :
##      Min.   1st Qu.    Median   3rd Qu.      Max. 
## -0.331641 -0.099554  0.019091  0.092464  0.306136 
## 
## Coefficients :
##                                              Estimate Std. Error t-value
## HDR$`Development Classification`Developing -0.2277917  0.0046049 -49.467
##                                             Pr(>|t|)    
## HDR$`Development Classification`Developing < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    114.78
## Residual Sum of Squares: 72.272
## R-Squared:      0.37037
## Adj. R-Squared: 0.35435
## F-statistic: 2447.01 on 1 and 4160 DF, p-value: < 2.22e-16

Results confirm that developed countries have higher HDI, African countries are at the bottom and European at the top.

Finally, we saw that the Income Index was a good predictor of HDI:

## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = HDR$`Human Development Index` ~ HDR$`Income Index`, 
##     data = HDR, model = "within", index = c("Country", "year"))
## 
## Unbalanced Panel: n=81, T=5-26, N=1770
## 
## Residuals :
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -0.2161860 -0.0311076  0.0089113  0.0372149  0.1382070 
## 
## Coefficients :
##                     Estimate Std. Error t-value  Pr(>|t|)    
## HDR$`Income Index` 0.8208594  0.0073159   112.2 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    44.057
## Residual Sum of Squares: 5.2089
## R-Squared:      0.88177
## Adj. R-Squared: 0.84092
## F-statistic: 12589.3 on 1 and 1688 DF, p-value: < 2.22e-16

The results confirm that: with the Income Index as the only regressor, the within model reports an R-squared of 0.88.

Multivariate Analysis

Talk about some of the relationships you observed in this part of the
investigation. Were there features that strengthened each other in terms of
looking at your feature(s) of interest?

We confirmed the strong correlation between all components and the regional breakdown.

Were there any interesting or surprising interactions between features?

It was surprising to see that the rankings rarely changed. Moreover, the Income Index is highly predictive of the HDI, which suggests that looking only at it might be enough.

OPTIONAL: Did you create any models with your dataset? Discuss the
strengths and limitations of your model.

I ran a few panel models to confirm the effects seen in the graphs. All initial indications were supported.


Final Plots and Summary

Plot One

Description One

I like the graph above because it is quite simple, yet highly informative.

It shows:

  • Positive trend of HDI
  • Similar trend of its components over time
  • Life Expectancy as the highest index
  • Education as the lowest
  • No change in ranks
  • Income index as the individual best predictor of HDI.

Plot Two

Description Two

This plot shows quite clearly the inequality across regions.

It shows that African countries are still low in the HDI and Life Expectancy indices, while European countries are at the top. It also reinforces the fact that the components are highly correlated.

Plot Three

Description Three

This graph combines insights from the previous two:

  • Africa at the bottom, Europe at the top
  • Positive trend over time

But it adds the insight of a lack of change over time. The gap between regions did not close.


Reflection

In this exercise we looked into the HDI and its components. I found out that the 3 components (Life Expectancy, Education and Income Indices) are highly correlated to each other, and the Income Index is the best predictor of the HDI.

Moreover, regional differences are strong (Africa versus Europe) and they persist over time. The gap between countries did not close.

The country case studies also show some interesting effects, such as Syria having a big fall in HDI values after the war period.

For further analysis it would be nice to investigate more parts of the HDI and subgroups of countries, such as eastern European countries before and after liberalization. Moreover, the other indices from the HDR (such as the Gender, Poverty and Inequality-adjusted indices) could also be explored in a similar manner.

On a higher level, one could question the usefulness of the HDI if the Income Index seems to be enough: is it just that Income and the other components go hand in hand, or is the Human Development Report initiative missing important parts of the puzzle?